Abstract

Finite metric spaces arise in many different contexts. Enormous bodies 
of data, scientific, commercial and others can often be viewed as large metric 
spaces. It turns out that the metric of graphs reveals a lot of interesting 
information. Metric spaces also come up in many recent advances in the 
theory of algorithms. Finally, finite submetrics of classical geometric objects 
such as normed spaces or manifolds reflect many important properties of the 
underlying structure. In this paper we review some of the recent advances in 
this area. 
1. Introduction

The constantly intensifying ties between combinatorics and geometry are among 
the most significant developments in Discrete Mathematics in recent years. These 
connections are manifold, and it is, perhaps, still too early to fully evaluate this 
relationship. This article deals only with what might be called the geometrization 
of combinatorics. Namely, the idea that viewing combinatorial objects from a
geometric perspective often yields unexpected insights. Even more concretely, we
concentrate on finite metric spaces and their embeddings. 

To illustrate the underlying idea, it may be best to begin with a practical 
problem. There are many disciplines, scientific, technological, economic and
others, which crucially depend on the analysis of large bodies of data. Technological
advances have made it possible to collect enormous amounts of interesting data, 
and further progress depends on our ability to organize and classify these data so 
as to allow meaningful and insightful analysis. A case in point is bioinformatics,
where huge bodies of data (DNA sequences, protein sequences, information about
expression levels, etc.) all await analysis. Let us consider, for example, the space of
all proteins. For the purpose of the current discussion, a protein may be viewed as 
a word in an alphabet of 20 letters (amino acids). Word lengths vary from under 
fifty to several thousands, the most typical length being several hundred letters. At 
this writing, there are about half a million proteins whose sequence is known.
Algorithms were developed over the years to evaluate the similarity of different proteins,
and there are standard computer programs that calculate distances among proteins 
very efficiently. This turns the collection of all known proteins into a metric space of 
about half a million elements. Proper analysis of this space is of great importance 
for the biological sciences. Thus, this huge body of sequence data takes a
geometric form, namely, a finite metric space, and it becomes feasible to use geometric
concepts and tools in the analysis of this data. 

In the combinatorial realm proper, and in the design and analysis of
algorithms, similar ideas have proved very useful as well. A graph is completely
characterized by its (shortest path, or geodesic) metric. The analysis of this metric
provides a lot of useful information about the graph. Moreover, given a graph 
G, one may modify G's metric by assigning nonnegative lengths to G's edges. By 
varying these edge lengths, a family of finite metrics is obtained, the properties of 
which reflect a good deal of structural information about G. We mention in passing 
that there are other useful and interesting geometric viewpoints of graphs. Thus, 
it is useful to geometrically realize a graph by assigning vectors to the vertices and 
posit that adjacent vertices correspond to orthogonal vectors. Graphs can encode 
the intersection patterns of geometric objects. These are all interesting instances of 
our basic paradigm: In the study of combinatorial objects, and especially graphs, 
it is often beneficial to develop a perspective from which the graph is perceived 
geometrically. 

Aside from what has already been accomplished, this approach holds
great promise. Combinatorics as we know it is still a very young subject. (There is
no official date of birth, and Euler was undoubtedly a giant in our field, but I think
that the dawn of modern combinatorics can be dated to the 1930s.) Discrete
Mathematics stands to gain a lot from interactions with older, better established fields.
This geometrization of combinatorics indeed creates clear and tangible connections 
with various subfields of geometry. So far the study of finite metric spaces has 
had substantial connections with the theory of finite-dimensional normed spaces, 
but it seems safe to predict that useful ties with differential geometry will soon 
emerge. With the possible incorporation of probabilistic tools, now commonplace 
in combinatorics, we can expect very exciting outcomes. 

A good sign for the vitality of this area is the large number of intriguing open 
problems. We will present here some of those that we particularly like. At a recent
meeting (Haifa, March '02), a list of open problems in this area was collected,
see http://www.kam.mff.cuni.cz/matousek/haifaop.ps. More extensive surveys of
this area can be found in [Mat02], Chapter 15, and [Ind01].

In view of this description, it should not come as a surprise to the reader that 
this theory is characterized as being

• Asymptotic: We are mostly interested in analyzing large, finite metric spaces, 
graphs and data sets. 

• Approximate: While it is possible to postulate that the geometric situation 
agrees perfectly with the combinatorics, it is much more beneficial to
investigate the approximate version. This leads to a richer theory that is quantitative
in nature. Rather than a binary question whether perfect mimicking is
possible or not, we ask how well a given combinatorial object can be approximated
geometrically. 

• Algorithmic: Existential results are very important and interesting in this 
area, but we always prefer it when such a result is accompanied by an efficient 
algorithm. 

• Comparative: There are certain classes of finite metric spaces
that we favor. These may have a particularly simple structure or be very
well understood. Other, less well-behaved spaces are compared to, and
approximated by, these "nice" metrics.

So, how should we compare two metrics? Let $(X, d)$ and $(Y, \rho)$ be
two metric spaces and let $\varphi : X \to Y$ be a mapping between them. We
quantify the extent to which $\varphi$ expands, resp. contracts, distances:

$$\mathrm{expansion}(\varphi) = \sup_{x \ne y \in X} \frac{\rho(\varphi(x), \varphi(y))}{d(x, y)} \quad \text{and} \quad \mathrm{contraction}(\varphi) = \sup_{x \ne y \in X} \frac{d(x, y)}{\rho(\varphi(x), \varphi(y))}.$$

Finally, the main definition is: $\mathrm{distortion}(\varphi) = \mathrm{expansion}(\varphi) \cdot \mathrm{contraction}(\varphi)$.

In other words, we consider the tightest constants $\alpha \ge \beta$ for which
$\alpha \ge \frac{\rho(\varphi(x), \varphi(y))}{d(x, y)} \ge \beta$ always holds, and define
$\mathrm{distortion}(\varphi)$ as $\alpha / \beta$. We call $\varphi$ an isometry
when $\mathrm{distortion}(\varphi) = 1$. This deviates somewhat from the conventional definition:
a map that multiplies all distances by a constant (not necessarily 1) is
considered here an isometry.
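
The definitions of expansion, contraction and distortion can be made concrete with a small sketch. It assumes a finite space given as a list of points, metrics given as Python functions, and an injective map; the names `distortion`, `cycle_metric` and `corners` are ours, chosen only for illustration. The example maps the 4-cycle, with its shortest path metric, onto the corners of a unit square in the plane.

```python
import math
from itertools import combinations

def distortion(points, d, rho, phi):
    """Return (expansion, contraction, distortion) of phi : (X, d) -> (Y, rho).

    Assumes phi is injective, so no image distance is zero.
    """
    expansion = contraction = 0.0
    for x, y in combinations(points, 2):
        ratio = rho(phi(x), phi(y)) / d(x, y)
        expansion = max(expansion, ratio)            # largest stretch
        contraction = max(contraction, 1.0 / ratio)  # largest shrink
    return expansion, contraction, expansion * contraction

def cycle_metric(i, j):
    """Shortest path metric of the 4-cycle on vertices 0, 1, 2, 3."""
    k = abs(i - j)
    return min(k, 4 - k)

# Hypothetical embedding: vertex i goes to the i-th corner of the unit square.
corners = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0), 3: (0.0, 1.0)}
expansion, contraction, dist = distortion(range(4), cycle_metric, math.dist, corners.get)
```

Here adjacent vertices keep distance 1 while opposite vertices go from distance 2 to $\sqrt{2}$, so the distortion of this particular map is $\sqrt{2}$. Rescaling the image leaves the distortion unchanged, in line with the convention on isometries above.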

The least distortion with which $(X, d)$ can be embedded in $(Y, \rho)$ is denoted
$c_Y(X) = c_Y(X, d)$. If $\mathcal{C}$ is a class of metric spaces, then the infimum of $c_Y(X)$ over
all $Y \in \mathcal{C}$ is denoted by $c_{\mathcal{C}}(X)$. When $\mathcal{C}$ is the class of finite-dimensional $\ell_p$ spaces
$\{\ell_p^n \mid n = 1, 2, \ldots\}$ we denote $c_{\mathcal{C}}(X)$ by $c_p(X)$.

One of the major problems in this area is: 

Problem 1. Given a finite metric space $(X, d)$ and a class of metrics $\mathcal{C}$, find the
(nearly) best approximation for $X$ by a metric from $\mathcal{C}$. In other words, find a metric
space $Y \in \mathcal{C}$ and a map $\varphi : X \to Y$ such that $\mathrm{distortion}(\varphi)$ (nearly) equals $c_{\mathcal{C}}(X)$.

The classes of metric spaces $\mathcal{C}$ for which this problem has so far been studied
are: (i) Metrics of normed spaces, especially $\ell_p^n$ for $\infty \ge p \ge 1$ and $n = 1, 2, \ldots$
(ii) Metrics of special families of graphs, most notably trees, as well as convex
combinations thereof.

One more convention: When we speak of $\ell_p$, we mean either infinite-dimensional $\ell_p$ or,
what is often the same, that we do not care about the dimension of the space in
which we embed a given metric.

To get a first feeling for this subject, let us consider the smallest nontrivial
example. Every 3-point metric embeds isometrically into the plane, but as we show
now, the metric of $K_{1,3}$, the 4-vertex tree with a root and three leaves, has no
isometric embedding into $\ell_2$. Let $x$, resp. $y_1, y_2, y_3$, be the images of the root and the leaves
of this tree. Since $d(x, y_i) = 1$ and $d(y_i, y_j) = 2$ for all $i \ne j$, it follows that the three
points $x, y_i, y_j$ are collinear for every $i \ne j$. Thus, all four points are collinear, leading
to a contradiction: a line contains only two points at distance 1 from $x$, so two of the
leaves would coincide. It can be shown that the least distorted image of this graph in
$\ell_2$ is in the plane, with $120^\circ$ angles between the edges. Below (Section 2) we
present a polynomial-time algorithm that determines $c_2(X)$, the least $\ell_2$ distortion,
for any finite metric $(X, d)$.
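
As a quick check on the planar image with $120^\circ$ angles, its distortion can be worked out by hand. The normalization below (root at the origin, leaves at unit distance from it) is our assumption; any rescaling gives the same distortion.

```latex
% Root x at the origin; leaves y_1, y_2, y_3 at unit vectors 120 degrees apart.
\|x - y_i\| = 1 = d(x, y_i), \qquad
\|y_i - y_j\| = 2 \sin 60^\circ = \sqrt{3} < 2 = d(y_i, y_j) \quad (i \ne j).
% Hence expansion = 1 and contraction = 2 / \sqrt{3}, so
\mathrm{distortion} = 1 \cdot \frac{2}{\sqrt{3}} = \frac{2}{\sqrt{3}} \approx 1.155 .
```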

Another easy fact which belongs in this warm-up section is that $c_\infty(X) = 1$
for every finite metric $(X, d)$. That is, the space $\ell_\infty$ contains an isometric
copy of every finite metric space.
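
One standard way to see this fact is the map sending each point $x$ to the vector of its distances to all points of $X$ (often called the Fréchet embedding; the text does not name the construction, so this attribution is ours). The sketch below is only an illustration, with hypothetical helper names, checked on the 5-cycle.

```python
def frechet_embedding(points, d):
    """Map each x to the vector (d(x, z)) indexed by z in points."""
    return {x: [d(x, z) for z in points] for x in points}

def linf(u, v):
    """The l_infinity distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))

def five_cycle(i, j):
    """Shortest path metric of the 5-cycle on vertices 0..4."""
    k = abs(i - j)
    return min(k, 5 - k)

pts = list(range(5))
emb = frechet_embedding(pts, five_cycle)
```

The map is an isometry because $|d(x, z) - d(y, z)| \le d(x, y)$ for every $z$ by the triangle inequality, with equality at $z = y$.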

Acknowledgment: Helpful remarks on this article by R. Krauthgamer, A. Magen, 
J. Matousek, and Yu. Rabinovich are gratefully acknowledged. 

2. Embedding into $\ell_2$

This is by far the most developed part of the theory. There are several good 
reasons for this part of the theory to have attracted the most attention so far. 
Consider the practical context, where a metric space represents some large data 
set, and where the major driving force is the search for good algorithms for data 
analysis. If the data set you need to analyze happens to be a large set of points
in $\ell_2$, there are many tools at your disposal, from geometry, algebra and analysis.
So if your data can be well approximated in $\ell_2$, this is of great practical advantage.
There is another reason for the special status of $\ell_2$ in this area. To explain it, we
need to introduce some terminology from Banach space theory. The Banach-Mazur
distance between two normed spaces $X$ and $Y$ is said to be $\le c$ if there is a linear
map $\varphi : X \to Y$ with $\mathrm{distortion}(\varphi) \le c$. What we are doing here may very well
be described as a search for the metric counterpart of this highly developed linear
theory. See [MS86] for an introduction to this field and [BL00] for a comprehensive
coverage of the nonlinear theory. The grandfather of the linear theory is the celebrated
theorem of Dvoretzky [Dvo61].

Theorem 1 (Dvoretzky). For every $n$ and $\varepsilon > 0$, every $n$-dimensional normed
space contains a $k = \Omega(\varepsilon^2 \cdot \log n)$-dimensional subspace whose Banach-Mazur distance
from $\ell_2$ is $\le 1 + \varepsilon$.

Thus, among embeddings into normed spaces, embeddings into $\ell_2$ are the
hardest to come by.

We begin our story with an important theorem of Bourgain [Bou85] . 

Theorem 2. Every $n$-point metric space embeds in $\ell_2$ with distortion $\le O(\log n)$.

Not only is this a fundamental result, Bourgain's proof of the theorem readily
translates into an efficient randomized algorithm that finds, for any given finite
$(X, d)$, an embedding in $\ell_2$ of distortion $\le O(\log n)$. The algorithm is so simple that
we record it here. Given the metric space $(X, d)$, we map every point $x \in X$ to
$\varphi(x)$, an $O(\log^2 n)$-dimensional vector. Coordinates in $\varphi(\cdot)$ correspond to subsets
$S \subseteq X$, and the $S$-th coordinate in $\varphi(x)$ is simply $d(x, S)$, the minimum of $d(x, y)$
over all $y \in S$. To define the map $\varphi$, we need to specify, then, the collection of
subsets $S$ that we utilize. These sets are selected randomly. Namely, you randomly
select $O(\log n)$ sets of size 1, another $O(\log n)$ sets of size 2, of size 4, 8, ..., $n/2$.
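
The recipe above can be sketched in a few lines. This is only an illustration under stated assumptions: the number of sets per size (`reps`, defaulting to $\lceil \log_2 n \rceil$) stands in for the unspecified constant in $O(\log n)$, no normalization of coordinates is applied, and the function name is ours. The sketch produces the coordinates $d(x, S)$ and does not verify the distortion guarantee.

```python
import math
import random

def bourgain_embedding(points, d, reps=None, seed=0):
    """Map each point x to the vector (d(x, S)) over randomly chosen subsets S.

    points: list of points; d: metric as a function of two points.
    reps: number of random sets per size (an assumed stand-in for O(log n)).
    """
    rng = random.Random(seed)
    n = len(points)
    if reps is None:
        reps = max(1, math.ceil(math.log2(n)))
    subsets = []
    size = 1
    while size <= max(1, n // 2):          # sizes 1, 2, 4, ..., up to n/2
        for _ in range(reps):
            subsets.append(rng.sample(points, size))
        size *= 2
    # The S-th coordinate of x is d(x, S) = min over y in S of d(x, y).
    return {x: [min(d(x, y) for y in S) for S in subsets] for x in points}

def eight_cycle(i, j):
    """Shortest path metric of the 8-cycle, used as sample input."""
    k = abs(i - j)
    return min(k, 8 - k)

pts = list(range(8))
emb = bourgain_embedding(pts, eight_cycle)
```

Each coordinate $x \mapsto d(x, S)$ is 1-Lipschitz by the triangle inequality, which is why such maps never expand distances by much; the substance of the theorem lies in showing that distances do not contract too much either.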

In view of Bourgain's Theorem, several questions suggest themselves naturally: 

• Is this bound tight? The answer is positive, see Theorem 3. 

• Given that $\max c_2(X)$ over all $n$-point metrics is $O(\log n)$, what about metrics
that are closer to $\ell_2$? Is there a polynomial-time algorithm to compute $c_2(X, d)$
(that is, the least distortion in an embedding of $X$ into $\ell_2$)? Again the answer
is affirmative, see below and Theorem 4.

• Are there interesting families of metric spaces for which $c_2$ is substantially
smaller than $\log n$? Indeed, there are, see, e.g., Theorem 5.

So let us proceed with the answers to these questions. Expanders are graphs
which cannot be disconnected into two large subgraphs by removing relatively few
edges. Specifically, a graph $G$ on $n$ vertices is said to be an $\varepsilon$-(edge)-expander if,
for every set $S$ of $\le n/2$ vertices, there are at least $\varepsilon |S|$ edges between $S$ and its
complement. It is said to be $k$-regular if every vertex has exactly $k$ neighbors.
The theory of expander graphs is a fascinating chapter in discrete mathematics
and theoretical computer science. It is not obvious that arbitrarily large $k$-regular
graphs exist with expansion $\varepsilon$ bounded away from zero. In fact, in the early days of
this area, conjectures to the contrary had been made. It turns out, however, that
expanders are rather ubiquitous. For every $k \ge 3$, the probability that a randomly
chosen $k$-regular graph has expansion $\varepsilon \ge k/10$ tends to 1 as the number of vertices
$n$ tends to $\infty$. It turns out that the metrics of expander graphs are as far from $\ell_2$
as possible.²
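
For small graphs the expansion parameter in this definition can be computed by brute force, which may help in getting a feel for it. The sketch below assumes vertices are numbered $0, \ldots, n-1$, edges are given as pairs, and sets of size at most $n/2$ are considered; the function name is ours, and the running time is exponential in $n$.

```python
from itertools import combinations

def edge_expansion(n, edges):
    """Largest eps such that every S with 1 <= |S| <= n/2 has at least
    eps * |S| edges leaving it (exhaustive search, small n only)."""
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            S = set(subset)
            boundary = sum(1 for u, v in edges if (u in S) != (v in S))
            best = min(best, boundary / size)
    return best

k4 = list(combinations(range(4), 2))        # complete graph on 4 vertices
c6 = [(i, (i + 1) % 6) for i in range(6)]   # cycle on 6 vertices
```

The complete graph $K_4$ has expansion 2, while the 6-cycle has expansion $2/3$, attained by an arc of three consecutive vertices. Long cycles have expansion tending to zero, which is the sense in which they are poor expanders.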

Theorem 3 ([LLR95], see also [Mat97, LM00]). Let $G$ be an $n$-vertex $k$-regular
$\varepsilon$-expander graph ($k \ge 3$, $\varepsilon > 0$). Then $c_2(G) \ge c \log n$ where $c$ depends only on $k$
and $\varepsilon$.

Metric geometry is by no means a new subject, and indeed metrics that embed
isometrically into $\ell_2$ were characterized long ago (see e.g. [Blu70]). This is a special
case of the more recent results. Let $\varphi : X \to \ell_2$ be an embedding. The condition
that $\mathrm{distortion}(\varphi) \le c$ can be expressed as a system of linear inequalities in the
entries of the Gram matrix corresponding to the vectors in $\varphi(X)$. Therefore, the
computation of $c_2(X)$ is an instance of semidefinite quadratic programming and can
be carried out in polynomial time.³ This formulation of the problem has, however,
other useful consequences. The duality principle of convex programming yields a
max-min formula for $c_2$.

² We freely interchange between a graph and its (shortest path) metric.

³ This is not quite accurate. Given an $n$-point space $(X, d)$ and $\varepsilon > 0$, the algorithm can
determine $c_2(X, d)$ with relative error $< \varepsilon$ in time polynomial in $n$ and $1/\varepsilon$.
Theorem 4 ([LLR95]). For every finite metric space $(X, d)$,

$$c_2(X, d) = \max_Q \sqrt{\frac{\sum_{i,j:\, q_{i,j} > 0} d^2(i, j)\, q_{i,j}}{-\sum_{i,j:\, q_{i,j} < 0} d^2(i, j)\, q_{i,j}}}$$

where the maximum is over all matrices Q so that 

1. Q is positive semidefinite, and 

2. The entries in every row in Q sum to zero. 

Consider the metric of the $r$-dimensional cube. As shown by Enflo [Enf69], the
least distorted embedding of this metric is simply the identity map into $\ell_2^r$, which
has distortion $\sqrt{r}$. Our first illustration of the power of the quadratic programming
method is that we provide a quick elementary proof for this fact, earlier proofs of
which required heavier machinery. The rows and columns of the matrix $Q$ are
indexed by the $2^r$ vertices of the $r$-dimensional cube. The $(x, y)$ entry of $Q$ is: (i)
$r - 1$ if $x = y$, (ii) $-1$ if $x$ and $y$ are neighbors (they are represented by two 0,1
vectors that differ in exactly one coordinate), (iii) $1$ if $x$ and $y$ are antipodal,
i.e., they differ in all $r$ coordinates, and (iv) zero otherwise. We leave
out the details and only indicate how to prove that $Q$ is positive semidefinite. It
is possible to express $Q = (r - 1)I - A + P$, where $A$ is the adjacency matrix of
the $r$-cube and $P$ is the (permutation) matrix corresponding to being antipodal.
The eigenfunctions of $A$ are well known, namely, they are the $2^r$ Walsh functions.
The same vectors happen to be also the eigenvectors of $Q$ and all have nonnegative
eigenvalues.
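
The omitted details can be checked numerically for small $r$. The sketch below builds the matrix $Q$ described above, verifies by direct multiplication that every Walsh function is an eigenvector, and evaluates the ratio under the square root in Theorem 4. All function names are ours, and $r \ge 2$ is assumed so that neighbors and antipodes are distinct.

```python
from itertools import product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def cube_matrix(r):
    """Return the cube vertices and Q = (r - 1) I - A + P as a list of rows."""
    verts = list(product((0, 1), repeat=r))
    entry = {0: r - 1, 1: -1, r: 1}        # diagonal, neighbors, antipodes
    Q = [[entry.get(hamming(x, y), 0) for y in verts] for x in verts]
    return verts, Q

def walsh_eigenvalues(r):
    """Check Q chi = lam * chi for each Walsh function chi; return the lams."""
    verts, Q = cube_matrix(r)
    eigenvalues = []
    for T in product((0, 1), repeat=r):
        chi = [(-1) ** sum(t * x for t, x in zip(T, v)) for v in verts]
        Qchi = [sum(q * c for q, c in zip(row, chi)) for row in Q]
        lam = Qchi[0]                      # chi is 1 at the all-zero vertex
        assert all(a == lam * c for a, c in zip(Qchi, chi))
        eigenvalues.append(lam)
    return eigenvalues

def dual_ratio(r):
    """The quotient under the square root in Theorem 4, for this Q."""
    verts, Q = cube_matrix(r)
    pos = neg = 0
    for i, x in enumerate(verts):
        for j, y in enumerate(verts):
            term = hamming(x, y) ** 2 * Q[i][j]
            if Q[i][j] > 0:
                pos += term
            elif Q[i][j] < 0:
                neg += term
    return pos / -neg
```

For a Walsh function indexed by a set $T$ the eigenvalue works out to $2|T| - 1 + (-1)^{|T|} \ge 0$, and the ratio equals $r$, so this $Q$ certifies $c_2 \ge \sqrt{r}$ for the $r$-cube.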

As another application of this method (also from [LM00]), here is a quick proof
of Theorem 3. It is known [Alo86] that if $G$ is a $k$-regular $\varepsilon$-expander graph and
$A$ is $G$'s adjacency matrix, then the second eigenvalue of $A$ is $\le k - \delta$ for some
$\delta$ that depends on $k$ and $\varepsilon$, but not on the size of the graph.⁴ It is not hard to
show that the vertices of a graph with bounded degrees can be paired up so that
every two paired vertices are at distance $\Omega(\log n)$. Let $P$ be the permutation matrix
corresponding to such a pairing. It is not hard to establish Theorem 3 using the
matrix $Q = kI - A + \frac{\delta}{2}(P - I)$. More sophisticated applications of this method will
be described below (Theorem 7).

3. Specific families of graph metrics 

For various graph families, it is possible to find embeddings into $\ell_2$ with distortion
asymptotically smaller than $\log n$. This often applies as well to graphs with arbitrary
nonnegative edge lengths.



⁴ $A$'s first eigenvalue is clearly $k$. This is the combinatorial analogue of Cheeger's
Theorem [Che70] about the spectrum of the Laplacian.



3.1. Trees

The metrics of trees are quite restricted. They can be characterized through a
four-term inequality (e.g. [DL97]). It is also not hard to see that every tree metric
embeds isometrically into $\ell_1$. They can also be embedded into $\ell_2$ with a relatively
low distortion.
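
A minimal sketch of such a test follows, assuming that the four-term inequality referred to is the standard four-point condition: among the three sums $d(x,y) + d(z,w)$, $d(x,z) + d(y,w)$ and $d(x,w) + d(y,z)$, the two largest are equal. The function name and the tolerance parameter are ours.

```python
from itertools import combinations

def satisfies_four_point(points, d, tol=1e-9):
    """Check the four-point condition on every quadruple of points."""
    for x, y, z, w in combinations(points, 4):
        sums = sorted([d(x, y) + d(z, w), d(x, z) + d(y, w), d(x, w) + d(y, z)])
        if sums[2] - sums[1] > tol:        # the two largest sums must agree
            return False
    return True

def path_metric(i, j):
    """Shortest path metric of a path (a tree) on vertices 0, 1, 2, 3."""
    return abs(i - j)

def four_cycle_metric(i, j):
    """Shortest path metric of the 4-cycle (not a tree)."""
    k = abs(i - j)
    return min(k, 4 - k)
```

The path passes the test, while the 4-cycle fails it: its three sums are 2, 2 and 4.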

Theorem 5 (Matousek [Mat99]). Every tree on $n$ vertices can be embedded into
$\ell_2$ with distortion $\le O(\sqrt{\log \log n})$.

Bourgain [Bou86] had earlier shown that this bound is attained for complete 
binary trees. (See [LS] for an elementary proof of this.) 

3.2. Planar graphs 

It turns out that the metrics of planar graphs have good embeddings into $\ell_2$.
Rao [Rao99] showed:

Theorem 6. Every planar graph embeds in $\ell_2$ with distortion $O(\sqrt{\log n})$.

A recent construction of Newman and Rabinovich [NR02] shows that this 
bound is tight. 

3.3. Graphs of high girth 

The girth of a graph is the length of the shortest cycle in the graph. If you
restrict your attention (as we do in this section) to graphs in which all vertex
degrees are $\ge 3$, then it is still a major challenge to construct graphs with very high
girth, i.e., having no short cycles. The metrics of such graphs seem far from $\ell_2$, so
in [LLR95] it was conjectured that $c_2(G) \ge \Omega(g)$ for every graph $G$ of girth $g$ in
which all vertex degrees are $\ge 3$. There are known examples of $n$-vertex $k$-regular
expanders whose girth is $\Omega(\log n)$. In view of Theorem 2, such graphs show that
this conjecture, if true, is best possible. Recently the following was shown:

Theorem 7 ([LMN]). Let $G$ be a $k$-regular graph, $k \ge 3$, with girth $g$. Then
$c_2(G) \ge \Omega(\sqrt{g})$.

Two proofs of this theorem are given in [LMN]. One is based on the notion of
Markov Type due to Ball [Bal92]. The underlying idea of this proof is that a random
walk on a graph with girth $g$ and all vertex degrees $\ge 3$ drifts at a constant speed
away from its starting point for time $\Omega(g)$. On the other hand, in an appropriately
defined class of random walks in Euclidean space, at time $T$ the walk is expected
to be only $O(\sqrt{T})$ away from its origin. If we compare the graph itself and
its image under an embedding in $\ell_2$, this discrepancy must be accounted for by a
metrical distortion. The comparison at time $T = \Omega(g)$ yields a distortion of $\Omega(\sqrt{g})$.

The other proof again employs semidefinite programming, using the matrix
$Q = \alpha I - A + \beta B$. Here $A$ is the graph's adjacency matrix, and $B$ is a 0,1 matrix
where $B_{xy} = 1$ if $x$ and $y$ are at distance $g/2$ in $G$. The parameters $\alpha$ and $\beta$ have to
satisfy the two conditions from Theorem 4. A key observation is that due to the high
girth, $B$ can be expressed as $P_{g/2}(A)$ where $P_j$ is the $j$-th Geronimus Polynomial,
a known family of orthogonal polynomials. The proof depends on the distribution
of zeros for these polynomials, and other analytical properties that they have.
Our present state of knowledge leads us to ask: 

Open Problem 1. How small can $c_2(G)$ be for a graph $G$ of girth $g$ in which all
vertices have degree $\ge 3$? The answer lies between $\Omega(\sqrt{g})$ and $O(g)$.

An earlier result of Rabinovich and Raz [RR98] reveals another connection
between high girth and distortion. Let $\varphi$ be a map from a graph of girth $g$ to a
graph of smaller Euler characteristic ($|E| - |V| + 1$). Then $\mathrm{distortion}(\varphi) \ge \Omega(g)$.



4. Algorithmic applications 

Among the most pleasing aspects of this field are the many beautiful
applications it has to the design of new algorithms.

4.1. Multicommodity flow and sparsest cuts 

Flows in networks are a classical subject in discrete optimization and a topic 
of many investigations (see [Sch02] for a comprehensive coverage). You are given
a network, i.e., a graph with two specified vertices: the source $s$ and the sink $t$.
Edges have nonnegative capacities. The objective is to ship as much of a given
commodity as possible between $s$ and $t$, subject to two conditions: (i) In every vertex other
than $s$ and $t$, matter is conserved. (ii) The flow through any edge must not exceed
the edge capacity. Let the set $S$ separate the vertices $s$ and $t$, i.e., it contains
exactly one of them. Define $S$'s capacity as the sum of edge capacities over those
edges that connect $S$ to its complement. The Max-flow Min-cut Theorem states
that the largest possible flow equals the minimum such capacity.

Here we consider the $k$-commodity version: Now there are $k$ source-sink pairs
$s_i, t_i$, $i = 1, 2, \ldots, k$, for the $i$-th commodity, and the $i$-th demand is $D_i > 0$. We seek
to determine the largest $\phi > 0$ for which it is possible to flow $\phi \cdot D_i$ of the $i$-th
commodity between $s_i$ and $t_i$, simultaneously for all $k \ge i \ge 1$, subject to conditions
(i) and (ii) above, where in (ii) the total flow through an edge should not exceed
its capacity. With every subset of the vertices $S$ we associate $\gamma(S) = \frac{\mathrm{cap}(S)}{\mathrm{dem}(S)}$. As
before, $\mathrm{cap}(S)$ is the sum of the capacities of edges between $S$ and its complement.
The denominator $\mathrm{dem}(S)$ is $\sum D_i$ over all indices $i$ so that $S$ separates $s_i$ and $t_i$.
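
For a small network the quantity $\min \gamma(S)$, with $\gamma(S)$ the ratio of $\mathrm{cap}(S)$ to $\mathrm{dem}(S)$, can be found by exhaustive search. The sketch below is only meant to make the definitions concrete; it assumes vertices numbered $0, \ldots, n-1$, capacities given as a dictionary on edges, and demands given as triples $(s_i, t_i, D_i)$, all of which are our representation choices.

```python
from itertools import combinations

def sparsest_cut(n, capacities, demands):
    """Minimize cap(S) / dem(S) over vertex subsets S by brute force.

    capacities: {(u, v): capacity}; demands: list of (s, t, D) triples.
    Returns (value, S); subsets separating no pair are skipped.
    """
    best_value, best_set = float("inf"), None
    for size in range(1, n):
        for subset in combinations(range(n), size):
            S = set(subset)
            cap = sum(c for (u, v), c in capacities.items() if (u in S) != (v in S))
            dem = sum(D for s, t, D in demands if (s in S) != (t in S))
            if dem > 0 and cap / dem < best_value:
                best_value, best_set = cap / dem, S
    return best_value, best_set

# Unit-capacity 4-cycle with a unit demand between every pair of vertices.
cycle_caps = {(i, (i + 1) % 4): 1 for i in range(4)}
all_pairs = [(s, t, 1) for s, t in combinations(range(4), 2)]
value, cut = sparsest_cut(4, cycle_caps, all_pairs)
```

On this example the minimum is $1/2$, attained by two adjacent vertices: two edges are cut and four pairs are separated. The search is exponential in $n$, consistent with the hardness of the problem in general.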

It is trivially true that $\phi \le \gamma(S)$ for every flow and every set $S$, but unlike the
one-commodity case, $\min \gamma(S)$ (the sparsest cut) need not equal $\max \phi$. As for the
algorithmic perspective, finding $\max \phi$ is a linear program, so it can be computed
in polynomial time. However, it is NP-hard to determine the sparsest cut. Also,
it is interesting to find out how far apart $\max \phi$ and $\min \gamma(S)$ can be. Consider the case
where the underlying graph is an expander, edges have unit capacities and every
pair of vertices form a source-sink pair with a unit demand. It is not hard to see
that in this case $\phi \le O\left(\frac{\min \gamma(S)}{\log n}\right)$. On the other hand,
Theorem 8 ([LLR95], see also [AR98]). In the $k$-commodity problem

$$\max \phi \ge \Omega\left(\frac{\min \gamma(S)}{\log k}\right).$$

We will be able to review the proof in Section 5.

4.2. Graph bandwidth 

In this computational problem, we are presented with an n-vertex graph G. 
It is required to label the vertices with distinct labels from {1, . . . ,n} so that the 
difference between the labels of any two adjacent vertices is not too big. Namely, 

$$\mathrm{bw}(G) = \min_{\psi} \max_{xy \in E(G)} |\psi(x) - \psi(y)|,$$

where the minimum is over all 1 : 1 maps $\psi : V \to \{1, \ldots, n\}$.

It is NP-hard to compute this parameter, and for many years no decent
approximation algorithm was known. However, a recent paper by Feige [Fei00]
provides a polylogarithmic approximation for the bandwidth. The statement of his
algorithm is simple enough to be recorded here:

1. Compute (a slight modification of) the embedding $\varphi : G \to \ell_2$ that appears in
the proof of Bourgain's Theorem 2.

2. Select a random line $\ell$ and project $\varphi(G)$ onto it.

3. Label the vertices of $G$ by the order in which their images appear along the
line.
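
Steps 2 and 3 are easy to sketch once an embedding is available. The code below assumes the embedding is given as a dictionary from vertices to coordinate tuples, and it picks the random line by drawing a Gaussian vector for its direction, which is one common way to obtain a uniformly random direction (the text does not specify how the line is chosen). The modified Bourgain embedding of step 1 is not reproduced here, and the function names are ours.

```python
import random

def label_by_random_projection(embedding, seed=0):
    """Project the embedded vertices on a random line and label them 1..n
    in the order in which their images appear along the line."""
    rng = random.Random(seed)
    dim = len(next(iter(embedding.values())))
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]   # assumed choice
    projection = {v: sum(a * b for a, b in zip(direction, x))
                  for v, x in embedding.items()}
    order = sorted(embedding, key=lambda v: projection[v])
    return {v: i + 1 for i, v in enumerate(order)}

def labeling_bandwidth(edges, labels):
    """Largest label difference across an edge, for the given labeling."""
    return max(abs(labels[u] - labels[v]) for u, v in edges)

# Toy input: a path on 6 vertices, already embedded isometrically on a line.
path_edges = [(i, i + 1) for i in range(5)]
path_embedding = {i: (float(i),) for i in range(6)}
labels = label_by_random_projection(path_embedding)
```

For the toy path the projection preserves or reverses the order of the vertices, so the resulting labeling has bandwidth 1, which is optimal.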

Let $\beta(G) := \max_{x, r} \frac{|B_r(x)|}{r}$, where $B_r(x)$ is the set of those vertices in $G$ at distance
$\le r$ from $x$. It's easy to see that $\mathrm{bw}(G) \ge \Omega(\beta(G))$, and an interesting feature of
Feige's proof is that it shows that $\mathrm{bw}(G) \le O(\beta(G) \log^c n)$. His paper gives $c = 3.5$,
which was later [DV99] improved to $c = 3$.

Open Problem 2. Is it true that $\mathrm{bw}(G) \le O(\beta(G) \log n)$?

It is not hard to see that this bound would be tight for expanders. 



4.3. Bartal's method 

The following general structure theorem of Bartal [Bar98] has numerous algo- 
rithmic applications: 

Theorem 9. For every finite metric space $(X, d)$ there is a collection of trees
$\{T_i \mid i \in I\}$, each of which has $X$ as its set of leaves, and positive weights $\{p_i \mid i \in I\}$
with $\sum_i p_i = 1$. Each of these tree metrics dominates $d$, i.e., $\mathrm{dist}_{T_i}(x, y) \ge d(x, y)$
for every $i$ and every $x, y \in X$. On the other hand, for every $x, y \in X$,

$$\sum_i p_i \cdot \mathrm{dist}_{T_i}(x, y) \le O(\log n \cdot \log \log n \cdot d(x, y)).$$

Bartal's algorithmic paradigm is a general principle underlying the numerous
algorithmic applications of this theorem: Given an algorithmic problem on input
a graph or a general metric space $(X, d)$, find a collection of tree metrics $T_i$ and
weights $p_i$ as in Theorem 9. Select one of the trees at random, where $T_i$ is selected
with probability $p_i$. Now solve the problem for input $T_i$. (This description assumes,
and this is often the case, that the original optimization problem is NP-hard in
general, but feasible for tree metrics.)

There are two features of the proof that we'd like to mention:

• The trees are HSTs (hierarchically well-separated trees). In such trees, edge
lengths decrease exponentially as you move from the root toward the leaves. They
feature prominently in many recent developments in this area.

• The proof makes substantial use of sparse decompositions of graphs. Given a graph,
one seeks a probability distribution on all partitions of the vertex set, so that (i)
parts have small diameters, and (ii) adjacent vertices are very likely to reside in the same
part. Such partitions have proved instrumental in the design of many algorithms.
In fact, an important tool in Rao's Theorem 6 was an earlier result [KPR93] about
the existence of very sparse partitions for the members of any minor-closed
family of graphs.



5. The mysterious $\ell_1$

We know much less about metric embeddings into $\ell_1$, and the attempts to
understand them give rise to many intriguing open problems. We start by defining
the cut metric $d_S$ on $X$, where $S \subseteq X$, as follows: $d_S(x, y) = 1$ if $x, y$ are separated
by $S$ and is zero otherwise. A simple but useful observation is that the collection
of all $n$-point metrics in $\ell_1$ forms a cone $C$ whose extreme rays are the cut metrics.⁵
The book [DL97] provides a coverage of this area.
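
One half of this observation is easy to see in code: a set of points on the real line has a metric that is a nonnegative combination of cut metrics, one cut per gap between consecutive points, and an $\ell_1$ metric is a sum of such line metrics, one per coordinate. The sketch below handles the one-dimensional case; the function names are ours.

```python
def line_metric_as_cuts(xs):
    """Write |xs[i] - xs[j]| as a weighted sum of cut metrics.

    Returns a list of (weight, S), where S is the set of indices lying to
    the left of a gap and weight is the length of that gap.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    cuts = []
    for k in range(len(order) - 1):
        gap = xs[order[k + 1]] - xs[order[k]]
        if gap > 0:
            cuts.append((gap, frozenset(order[: k + 1])))
    return cuts

def cut_metric(S):
    """The cut metric d_S: 1 if exactly one of i, j lies in S, else 0."""
    return lambda i, j: 1 if (i in S) != (j in S) else 0

xs = [3, 0, 7, 1]
cuts = line_metric_as_cuts(xs)
```

Two points at positions $a < b$ are separated by exactly the cuts whose gaps lie between $a$ and $b$, and those gap lengths sum to $b - a$.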

We are now able to complete the proof of Theorem 8. We retain the
terminology of the discussion around that theorem. Linear programming duality yields
the following alternative expression for the maximum $k$-commodity flow problem
on $G = (V, E)$:

$$\max \phi = \min_d \frac{\sum_{ij \in E} c_{i,j}\, d(i, j)}{\sum_j D_j \cdot d(s_j, t_j)}$$

Here $c_{i,j}$ is the capacity of the edge $ij$, and the minimum is over all graphical metrics $d$ on $G$.
Namely, you assign nonnegative lengths to $G$'s edges and $d$ is the induced shortest path
metric on $G$'s vertices. Now let $d$ be the graphical metric that minimizes this expression.
A slight adaptation of Bourgain's embedding algorithm yields an $\ell_1$ metric $\rho$ so that
$\rho(i, j) \le d(i, j)$ for all $i, j$ and $\rho(s_j, t_j) \ge \Omega\left(\frac{d(s_j, t_j)}{\log k}\right)$ for all $j$. But the minimum of
$\frac{\sum_{ij \in E} c_{i,j}\, \rho(i, j)}{\sum_j D_j \cdot \rho(s_j, t_j)}$
over $\ell_1$ metrics is attained for $\rho$ a cut metric, since cut metrics are the extreme rays
of the cone of $\ell_1$ metrics $C$. This minimum over cut metrics is simply $\min \gamma(S)$, the
sparsest cut value of the network. The conclusion follows.

The identification between $\ell_1$ metrics and the cut cone $C$ makes it desirable
to find an algorithm to solve linear optimization problems whose feasible set is this
convex cone. Such an algorithm would solve at one fell swoop a host of interesting
(and hard) problems such as max-cut, graph bisection and more. This hope is hard
to realize, since the ellipsoid method (e.g. [Sch02]) applies only to convex bodies
for which we have efficient membership and separation oracles. For the convex
cone $C$, that would mean that we need to efficiently determine whether a given
real symmetric matrix $M$ represents the metric on $n$ points in $\ell_1$. Moreover, if
not, we ought to find a hyperplane (in $n^2$ dimensions) that separates $M$ from $C$.
Unfortunately, these questions are NP-hard (e.g. [DL97]). It becomes, therefore,
interesting to approximate the cone $C$. So, can we find another cone that is close to
$C$ and for which computationally efficient membership and separation oracles exist?
There is a natural candidate for the job. We say that a matrix $M$ is in square-$\ell_2$ if
there are points $x_i$ in $\ell_2$ such that $M_{ij} = \|x_i - x_j\|^2$. Let $S$ be the collection of
all square-$\ell_2$ matrices which are also a metric (i.e. the entries in $M$ also satisfy the
triangle inequality). It is not hard to see that $C \subseteq S$, but we ask:

⁵ For each $n$, the $n$-point metrics in $\ell_1$ form a cone $C_n$, but we suppress the index $n$.

Open Problem 3. What is the smallest $\alpha = \alpha(n)$, such that every $n \times n$ matrix
$M \in S$ can be embedded in $\ell_1$ with distortion $\le \alpha$?

It is not hard to see that every finite $\ell_2$ metric embeds isometrically into $\ell_1$.
But what about the opposite direction?

Open Problem 4. Find $\max c_2(X)$ over all $(X, d)$ that are $n$-point metrics in
$\ell_1$. As we saw above, for the $n = 2^r$ vertices of the $r$-cube the answer is $\sqrt{r} =
\sqrt{\log n}$. We suspect that this is the extreme case. No example is known where $c_2$ is
asymptotically larger than $\sqrt{\log n}$.

5.1. Dimension reduction 

Let us return to the applied aspect of this area. Even when a given metric 
space can be approximated well in some normed space, the dimension of the host 
space is quite significant. Data analysis and clustering in $\ell_2^N$ for large $N$ is by no
means easy. In fact, practitioners in these areas often speak about the curse of
dimensionality when they refer to this problem. In $\ell_2$ there is a basic result that
answers this problem.

Theorem 10 (Johnson-Lindenstrauss [JL84]). Every $n$-point metric in $\ell_2$ can
be embedded into $\ell_2^k$ with distortion $\le 1 + \varepsilon$ where $k \le O\left(\frac{\log n}{\varepsilon^2}\right)$.

Here, again, the proof yields an efficient randomized algorithm. Namely, select
a random $k$-dimensional subspace and project the points to it.
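
A minimal sketch of such a randomized map follows. It deliberately swaps in a commonly used variant: instead of an orthogonal projection onto a random $k$-dimensional subspace, it multiplies by a $k \times N$ matrix of independent Gaussian entries scaled by $1/\sqrt{k}$. The function name is ours, and the choice of $k$ needed for a given $\varepsilon$ is left to the caller.

```python
import math
import random

def random_projection(vectors, k, seed=0):
    """Map each vector in R^N to R^k via a scaled random Gaussian matrix.

    This Gaussian variant stands in for projecting on a random subspace.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    scale = 1.0 / math.sqrt(k)
    G = [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)] for _ in range(k)]
    return [[sum(g * x for g, x in zip(row, v)) for row in G] for v in vectors]

sample = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
images = random_projection(sample, k=2)
```

The map is linear, so it sends differences of points to differences of images; the theorem concerns how well the lengths of those differences are preserved when $k$ is of order $\log n / \varepsilon^2$.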

What is the appropriate analogue of this theorem for $\ell_1$ metrics?

Open Problem 5. What is the smallest $k = k(n, \varepsilon)$ so that every $n$-point metric
in $\ell_1$ can be embedded into $\ell_1^k$ with distortion $\le 1 + \varepsilon$?

We know very little at the moment, namely $\Omega(\log n) \le k \le O(n \log n)$ for
constant $\varepsilon > 0$. The lower bound is trivial and the upper bound is from [Sch87,
Tal90]. Note that if the truth is at the lower bound, then this provides an affirmative
answer to Open Problem 4.
5.2. Planar graphs and other minor-closed families

One of the most fascinating problems about $\ell_1$ metrics is:

Open Problem 6. Is there an absolute constant $C > 0$ so that every metric of
a planar graph embeds into $\ell_1$ with distortion $\le C$?

Even more daringly, the same can be asked for every minor-closed family
of graphs. Some initial success for smaller graph families has been achieved
already [GNRS99].

5.3. Large girth 

Is there an analogue of Theorem 7 for embeddings into $\ell_1$?

Open Problem 7. How small can $c_1(G)$ be for a graph $G$ of girth $g$ in which all
vertices have degree $\ge 3$? Specifically, can $c_1(G)$ stay bounded as $g$ tends to $\infty$?

6. Ramsey-type theorems for metric spaces

The philosophy of modern Ramsey Theory (as developed e.g. in [GRS90])
can be stated as follows: Large systems necessarily contain substantial "islands of 
order". Dvoretzky's Theorem certainly falls into this circle of ideas. But what 
about the metric analogues? 

Open Problem 8. What is the largest $f(\cdot, \cdot)$ so that every $n$-point metric $(X, d)$
has a subset $Y$ of cardinality $\ge f(n, t)$ with $c_2(Y) \le t$? (We mean, of course, the
metric $d$ restricted to the set $Y$.)

For $t$ close to 1, the answer is known, namely, $f(n, t) = \Theta(\log n)$. For larger $t$
the behavior is known to be different [BLMN].